NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Physics-Regulated Deep Reinforcement Learning: Invariant Embeddings

Cao, Hongpeng; Mao, Yanbing; Sha, Lui; Caccamo, Marco (May 2024, The Twelfth International Conference on Learning Representations)

This paper proposes the Phy-DRL: a physics-regulated deep reinforcement learning (DRL) framework for safety-critical autonomous systems. The Phy-DRL has three distinguished invariant-embedding designs: i) residual action policy (i.e., integrating data-driven-DRL action policy and physics-model-based action policy), ii) automatically constructed safety-embedded reward, and iii) physics-model-guided neural network (NN) editing, including link editing and activation editing. Theoretically, the Phy-DRL exhibits 1) a mathematically provable safety guarantee and 2) strict compliance of critic and actor networks with physics knowledge about the action-value function and action policy. Finally, we evaluate the Phy-DRL on a cart-pole system and a quadruped robot. The experiments validate our theoretical results and demonstrate that Phy-DRL features guarantee safety compared to purely data-driven DRL and solely model-based design, while offering remarkably fewer learning parameters and fast training towards safety guarantee.
more » « less
Full Text Available
Physics-Model-Regulated Deep Reinforcement Learning Towards Safety & Stability Guarantees

https://doi.org/10.1109/CDC49753.2023.10383560

Cao, Hongpeng; Mao, Yanbing; Sha, Lui; Caccamo, Marco (December 2023, IEEE Conference on Decision and Control)

Deep reinforcement learning (DRL) has demonstrated impressive success in solving complex control tasks by synthesizing control policies from data. However, the safety and stability of applying DRL to safety-critical systems remain a primary concern and challenging problem. To address the problem, we propose the Phy-DRL: a novel physics-model regulated deep reinforcement learning framework. The Phy-DRL is novel in two architectural designs: a physics-model-regulated reward and residual control, which integrates physics-model-based control and data-driven control. The concurrent designs enable the Phy-DRL to mathematically provable safety and stability guarantees. Finally, the effectiveness of the Phy-DRL is validated by an inverted pendulum system. Additionally, the experimental results demonstrate that the Phy-DRL features remarkably accelerated training and enlarged reward.
more » « less
Phy-Taylor: Partially Physics-Knowledge-Enhanced Deep Neural Networks via NN Editing

https://doi.org/10.1109/TNNLS.2023.3325432

Mao, Yanbing; Gu, Yuliang; Sha, Lui; Shao, Huajie; Wang, Qixin; Abdelzaher, Tarek (January 2024, IEEE Transactions on Neural Networks and Learning Systems)

Full Text Available
Sℒ 1-Simplex: Safe Velocity Regulation of Self-Driving Vehicles in Dynamic and Unforeseen Environments

https://doi.org/10.1145/3564273

Mao, Yanbing; Gu, Yuliang; Hovakimyan, Naira; Sha, Lui; Voulgaris, Petros (January 2023, ACM Transactions on Cyber-Physical Systems)
Tarek Abdelzaher, Karl-Erik Arzen (Ed.)
This article proposes a novel extension of the Simplex architecture with model switching and model learning to achieve safe velocity regulation of self-driving vehicles in dynamic and unforeseen environments. To guarantee the reliability of autonomous vehicles, an ℒ₁adaptive controller that compensates for uncertainties and disturbances is employed by the Simplex architecture as a verified high-assurance controller (HAC) to tolerate concurrent software and physical failures. Meanwhile, the safe switching controller is incorporated into the HAC for safe velocity regulation in the dynamic (prepared) environments, through the integration of the traction control system and anti-lock braking system. Due to the high dependence of vehicle dynamics on the driving environments, the HAC leverages the finite-time model learning to timely learn and update the vehicle model for ℒ₁adaptive controller, when any deviation from the safety envelope or the uncertainty measurement threshold occurs in the unforeseen driving environments. With the integration of ℒ₁adaptive controller, safe switching controller and finite-time model learning, the vehicle’s angular and longitudinal velocities can asymptotically track the provided references in the dynamic and unforeseen driving environments, while the wheel slips are restricted to safety envelopes to prevent slipping and sliding. Finally, the effectiveness of the proposed Simplex architecture for safe velocity regulation is validated by the AutoRally platform.
more » « less
Full Text Available
Impact of Confirmation Bias on Competitive Information Spread in Social Networks

https://doi.org/10.1109/TCNS.2021.3050117

Mao, Yanbing; Akyol, Emrah; Hovakimyan, Naira (June 2021, IEEE Transactions on Control of Network Systems)
null (Ed.)
Full Text Available
Robust Adaptive Quadratic Programs using Control Lyapunov and Barrier Functions

Zhao, Pan Zhao; Mao, Yanbing Mao; Tao, Chuyuan; Hovakimyan, Naira; Wang, Xiaofeng (December 2020, IEEE Conference on Decision and Control)
null (Ed.)
Full Text Available
On Inference of Network Topology and Confirmation Bias in Cyber-Social Networks

https://doi.org/10.1109/TSIPN.2020.3015283

Mao, Yanbing; Akyol, Emrah (January 2020, IEEE Transactions on Signal and Information Processing over Networks)
null (Ed.)
Full Text Available
Detectability of Intermittent Zero-Dynamics Attack in Networked Control Systems

https://doi.org/10.1109/CDC40024.2019.9029997

Mao, Yanbing; Jafarnejadsani, Hamidreza; Zhao, Pan; Akyol, Emrah; Hovakimyan, Naira (December 2019, IEEE Conference on Decision and Control)

Full Text Available
Novel Stealthy Attack and Defense Strategies for Networked Control Systems

https://doi.org/10.1109/TAC.2020.2997363

Mao, Yanbing; Jafarnejadsani, Hamidreza; Zhao, Pan; Akyol, Emrah; Hovakimyan, Naira (August 2019, IEEE Transactions on Automatic Control)

Full Text Available

Search for: All records